Randomness in the choice of neighbours promotes cohesion in mobile animal groups

Classic computational models of collective motion suggest that simple local averaging rules can promote many observed group-level patterns. Recent studies, however, suggest that rules simpler than local averaging may be at play in real organisms; for example, fish stochastically align towards only one randomly chosen neighbour and yet the schools are highly polarized. Here, we ask: how do organisms maintain group cohesion? Using a spatially explicit model, inspired by empirical investigations, we show that group cohesion can be achieved in finite groups even when organisms randomly choose only one neighbour to interact with. Cohesion is maintained even in the absence of local averaging that requires interactions with many neighbours. Furthermore, we show that choosing a neighbour randomly is a better way to achieve cohesion than interacting with just the closest neighbour. To understand how cohesion emerges from these random pairwise interactions, we turn to a graph-theoretic analysis of the underlying dynamic interaction networks. We find that randomness in choosing a neighbour gives rise to well-connected networks that essentially cause the groups to stay cohesive. We compare our findings with the canonical averaging models (analogous to the Vicsek model). In summary, we argue that randomness in the choice of interacting neighbours plays a crucial role in achieving cohesion.


Introduction
Organisms that live in social groups often exhibit collective motion, which is important for achieving functions like foraging, navigation, evasion from predation, etc. [1][2][3][4]. To explain the highly coordinated motion of such animal groups, classic models of collective motion assume that an agent moves along the average direction of motion of all neighbours that are within a short metric distance around it [5]. Subsequent models extend these ideas to incorporate cohesion [6][7][8]; they assume that agents also move towards an average direction determined by the location of all nearby individuals. Broadly, theory and computational studies predict that non-trivial group-level phenomena can emerge even when organisms follow such simple rules that depend on the states of their neighbours [5,6,9–15].
Surprisingly, recent empirical studies [16][17][18][19][20] show that organisms interact through rules that are probably much simpler than averaging information of several individuals; in fact, each organism may interact with just a single randomly chosen neighbour (termed stochastic pairwise interaction) to achieve high levels of group polarization [18]. This order, counterintuitively, can emerge from sampling biases in the interactions owing to the finite size of the group. Consequently, once the group is in a polarized state, it continues to reside in that state for a long time [21]. To maintain group polarization, group cohesion is a must. However, the mechanisms that keep the group cohesive, in particular the role of stochastic decision making, are not as well explored.
Traditionally, to explain group cohesion, computational models assume that organisms interact with all individuals within a fixed metric distance [6,22,23]. However, several empirical investigations [24][25][26] have shown that organisms, in fact, interact with a select few, referred to as topological neighbours, and are not strictly limited by metric distances (say, seven nearest neighbours in the case of starling flocks [24]). They argue that such topological interactions provide substantially better cohesion than metric distance-based rules. In some fish schools, interaction with the nearest one appears to be sufficient in producing the observed cohesion [27]. In another species [17], fish appear to choose the most influential one among their neighbours to maintain cohesion. In echolocating organisms like bats, it is challenging to detect the neighbours as the returning echoes are faint and are probably masked by the loud calls of their neighbours. Consequently, in large groups, bats may only detect one neighbour at a time [28]; yet roosting bats manage to maintain cohesion even in large mobile groups. While we do expect species-specific behavioural rules at fine scales, one broad question arises at this point: how does group cohesion depend on the choice of neighbours? In this context, we note that real organisms' behaviours are probably stochastic. While computational models do include an element of randomness for the motion of organisms, they typically ignore this in the context of choice of interacting neighbours (but see [29,30]). Here, we reveal the surprising role of randomness in the choice of neighbours in maintaining group cohesion.
In this article, we investigate how cohesion emerges from stochastic attraction interactions using a spatially explicit agent-based model. We study mobile groups made up of individuals that interact with just one neighbour at a time. We explore a class of interaction models that differ only in the way the organism chooses its neighbour to interact with, based on randomness in the choice of neighbours. To understand how local interactions lead to cohesion at the group level, we reconstruct the underlying interaction network and employ a graph-theoretic analysis to study the properties of the network in light of how it is linked to the group's ability to stay cohesive. We compare our findings with the canonical equivalents of the averaging interactions to explain how simpler interactions are sufficient in achieving cohesion.

Agent-based spatially explicit model
We develop agent-based models in two spatial dimensions (2D) to study the dynamics of collective motion. While these models broadly follow the principles of classic self-propelled particle models of collective motion [6,13,31], we make a key modification: an agent randomly chooses the neighbours it interacts with from a set of nearby visible neighbours. Our objective in this study is to understand how such a choice affects the emergence of cohesion in mobile animal groups. We carry out our analysis with a detailed model that is motivated by recent empirical studies [16,18,27,32] and that considers probabilistic interaction rules, asynchronous updating, variable speed of agents and collision avoidance. To ascertain the generality of our findings, we repeat our analyses with a variety of models that include variants of the detailed model where key assumptions are relaxed. We also investigate a minimal version of the model where agents are just point particles moving with a constant velocity while interacting probabilistically; this minimal model has a complexity comparable to that of the canonical Vicsek model.
In this section, we only outline key model features of the detailed agent-based model and relegate the detailed mathematical descriptions of all the models considered and their variants to appendices A and B.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220124
Every organism i is described by its position in space ($x_i$) and velocity (speed $s_i$ and direction $e_i$). These are trivially related by the kinematic equation $\dot{x}_i = s_i e_i$. In our notation, while $e_i$ represents the unit orientation vector, $\angle e_i$ represents the angle of that orientation vector about the positive x-axis.
Agents update both their speed and direction of motion as they interact with other agents in their visually perceptible neighbourhood. The stochastic decision making process of organisms is captured by the way the behavioural interactions of the organisms are implemented via asynchronous update rules and the choice of neighbours, as described below.
Each organism interacts asynchronously, performing (i) exactly one behavioural interaction at a given instant (with a propensity that depends on the rate of the interactions), (ii) at a unique time (sampled from an exponential distribution).
We assume that behavioural interactions between agents depend on topological neighbourhood [24]: organisms perceive K nearest neighbours from their visual zone. They integrate information (orientation and position) from k randomly chosen neighbours out of the K (k ≤ K) perceived neighbours for both the alignment and attraction interactions (figure 1). Our explorations show that choosing different neighbourhood sizes for alignment and attraction interactions does not alter our conclusions.
The four key behavioural rules are: (i) alignment interaction: at a rate $r_p$, an agent copies the speed and direction of motion of agents present in its topological neighbourhood; (ii) attraction interaction: at a rate $r_c$, an agent moves towards the centre of mass of its nearby topological neighbours with a speed that depends on how far the organism is from this neighbourhood; (iii) spontaneous turning: at a rate $r_s$, an agent chooses a new direction $\angle e_d$, sampled from a (circular) normal distribution with mean $\angle e_i$ and variance $\sigma_a^2$. The speed of the individual is sampled from a (truncated) normal distribution with mean $s_0$ and variance $\sigma_s^2$. When the variances are small, spontaneous turning leads to a new direction of motion (with a new speed) that is close to the previous direction; and (iv) collision avoidance: agents avoid collision with other agents as they move. We assume they do this by turning away from the other agent and by slowing down (as reported in [27]).
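The asynchronous update described above can be illustrated with a minimal Python sketch. It assumes a Gillespie-style scheme in which an agent's three stochastic interactions act as independent exponential clocks; the function name `next_event` and the numerical rates are hypothetical and are not taken from the model's appendices.

```python
import random

def next_event(rates, rng):
    """Sample the waiting time and the type of an agent's next interaction.

    rates: dict mapping an interaction name to its rate (events per unit time).
    Returns a tuple (waiting_time, interaction_name).
    """
    total = sum(rates.values())
    # With independent exponential clocks, the time to the first event
    # is itself exponential with the summed rate.
    wait = rng.expovariate(total)
    # The clock that fires first is chosen with probability proportional to its rate.
    names = list(rates)
    kind = rng.choices(names, weights=[rates[n] for n in names], k=1)[0]
    return wait, kind

rng = random.Random(0)
# Hypothetical values standing in for r_p (alignment), r_c (attraction), r_s (turning).
rates = {"align": 1.0, "attract": 0.5, "turn": 0.25}
for _ in range(3):
    wait, kind = next_event(rates, rng)
    print(f"after {wait:.3f} time units: {kind}")
```

Because every agent draws its own waiting times, two agents almost surely never update at the same instant, which is one way to realize exactly one interaction at a unique time.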

Choice of neighbours and interaction types
Depending on the values of K and k, we can achieve a variety of interaction types. We primarily consider the following ones. We get a stochastic pairwise-interaction when we set k = 1; e.g. relevant to fishes [18,27] and bats [28] where organisms interact with only one randomly chosen neighbour. This results in the agent copying the direction of (alignment) or moving towards (attraction), a single but randomly chosen neighbouring agent. Here, K will determine the extent of the neighbourhood with which the organisms may interact.
Within the stochastic pairwise interaction models, when K = 1 an agent interacts with its nearest neighbour (referred to as nearest-neighbour-type). At the other extreme, when K = N − 1 an agent interacts with any agent from the entire group (where N is the size of the group).
Moving beyond the stochastic pairwise interactions, we also consider the Vicsek-like local averaging models by setting k = K (with K > 1). For the alignment interaction, this corresponds to a topological analogue of the widely studied Vicsek interaction [5]. For the attraction interactions, it mimics the topological analogue of the Boids [12] or the Couzin model [6]. Note here that by setting k = K, a focal agent interacts with all the neighbours it can perceive at a given time, rendering the neighbour selection process 'deterministic' for that instant of time. However, the neighbours themselves change with time as the group moves.
The simulation is carried out in a two-dimensional, unbounded (non-periodic) domain. Further details on initial conditions, parameter values, number of replicates, etc., are described in appendix A. More information on the sensitivity of the collective dynamics to parameter values can be found in the electronic supplementary material, S2. Remarks comparing our model to some of the earlier models of collective motion are pertinent here. Most of the models assume that individuals interact with every other individual within a fixed metric distance. Further, they assume that all individuals interact and update their locations synchronously, moving at a constant speed [5,6,33]. Following many recent empirical studies, our model assumptions differ on these aspects. First, we assume a topological distance for interactions. However, unlike Ballerini et al. [24], who propose that birds interact with a fixed number of nearest neighbours, we assume that organisms may randomly choose any k of the K (K ≥ k) nearest neighbours; we show this randomness plays an important role in maintaining group cohesion. We also assume that individuals move at variable speeds [16,34–36] and update their motion asynchronously [29,30,37], in line with many recent empirical studies [16,18,27,32]. We reiterate that, in order to show the generality of our conclusions, we also study variations of parameter values as well as variations of the model, including a minimal model that has a substantially reduced number of parameters.
Note that the detailed model presented in this section is a non-trivial extension of the mean-field model discussed in a previous work from our group [18], which captures the fluctuations of the polarization order parameter observed in real fishes. We further remark that the role of randomness in the choice of neighbours and how it influences the collective behaviour of a flock has been studied before [21,38]. In [38], the random selection of neighbours is used in a network model to mimic the changing neighbours in the Vicsek model. However, these studies are limited to only how the agents align, while we focus on group cohesion in this manuscript.
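The interaction types above differ only in the values of K and k, which a short Python sketch can make concrete. The function `choose_neighbours` is a hypothetical helper, and for brevity it ranks neighbours by distance alone and ignores the visual zone that the detailed model uses.

```python
import math
import random

def choose_neighbours(positions, i, K, k, rng):
    """Pick k interaction partners for agent i, at random, from its K nearest neighbours.

    positions: list of (x, y) tuples, one per agent.
    Simplification (assumption): every other agent is treated as visible.
    """
    n = len(positions)
    if not 1 <= k <= K <= n - 1:
        raise ValueError("need 1 <= k <= K <= N - 1")
    others = [j for j in range(n) if j != i]
    # Topological neighbourhood: rank the other agents by distance, keep the K closest.
    others.sort(key=lambda j: math.dist(positions[i], positions[j]))
    perceived = others[:K]
    # Random choice without replacement of k of the K perceived neighbours.
    return rng.sample(perceived, k)

rng = random.Random(0)
positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 0.0)]
nearest = choose_neighbours(positions, 0, K=1, k=1, rng=rng)    # nearest-neighbour type
pairwise = choose_neighbours(positions, 0, K=3, k=1, rng=rng)   # stochastic pairwise
averaging = choose_neighbours(positions, 0, K=3, k=3, rng=rng)  # Vicsek-like averaging
```

With k = K the returned set is always the full perceived neighbourhood, so the only remaining randomness is in the order of the list, which an averaging rule does not use.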

Group cohesivity and quantification
We first provide an intuitive picture of the group dynamics. When an agent is all by itself, the only interaction it undergoes is spontaneous turning, which changes its movement direction and speed at discrete points in time, resulting in random motion. When two or more agents are present, they begin to interact via alignment and attraction rules. In such a set-up, the agents in the front often behave as though they were isolated since there is a visual field that limits the organism's perception. These agents have a greater tendency to wander away from the group until a spontaneous turn changes their orientation, allowing them to 'see' a neighbour in the group; this may create the possibility for a successful attraction interaction at a later time that can bring them back to the group. At the same time, other group members may follow a wandering agent. When a few such individuals succeed in doing so, it could cause a cascade of many members leaving and eventually breaking a group into two or more smaller clusters. Thus, we investigate the interaction rules that maintain group cohesion. Before we do so, we need to define a way to quantify the cohesivity of groups.
Cohesivity of a mobile group can be quantified in a number of different ways that include characterizing the average near-neighbour distance [39], the area of the convex hull of the spatial positions of the organisms and the average distance to the barycenter of the group [17]. However, we choose a more stringent measure for cohesion based on how organisms are positioned in space with respect to their neighbours. Using a density-based spatial clustering approach (DBSCAN) [4,40–42], we group the organisms into different clusters. Here, a cluster is defined as the set of all organisms, where every organism in the set is less than ε distance away from at least one other organism in the set. Then, we define a cohesion parameter (denoted as C) as the average size (in number of organisms) of the largest cluster in the group normalized by the total size of the group (averaged over time of the simulation and over multiple realizations). Hence, the value of C denotes the average fraction of organisms in the largest cluster in the group. A value of 1 means that the organisms were always cohesive, with no break-up events. Note that very small values of ε (smaller than the average distance between agents) will result in low values of C and very large values (of the order of the distance travelled by an agent in the time simulated) will result in high values for C. However, the results reported in the article do not change appreciably across a wide range of values for ε, since a non-cohesive group will drift away in open boundary simulations while a cohesive group will not (more details in the electronic supplementary material, S2-S4).
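The definition of C can be made concrete with a small Python sketch. It does not call a DBSCAN library; instead, as an assumption, it implements the cluster definition stated above directly, by flood-filling over pairs of organisms closer than ε. The function names and the example coordinates are hypothetical.

```python
import math

def largest_cluster_fraction(positions, eps):
    """Fraction of agents in the largest cluster of one snapshot.

    Two agents belong to the same cluster if they are linked by a chain of
    agents in which consecutive members are less than eps apart.
    """
    n = len(positions)
    unvisited = set(range(n))
    largest = 0
    while unvisited:
        # Grow one cluster from an arbitrary seed agent.
        stack = [unvisited.pop()]
        size = 0
        while stack:
            a = stack.pop()
            size += 1
            near = [b for b in unvisited
                    if math.dist(positions[a], positions[b]) < eps]
            for b in near:
                unvisited.remove(b)
                stack.append(b)
        largest = max(largest, size)
    return largest / n

def cohesion_parameter(snapshots, eps):
    """Average the largest-cluster fraction over a sequence of snapshots."""
    return sum(largest_cluster_fraction(p, eps) for p in snapshots) / len(snapshots)

together = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.5, 0.0)]
split = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (10.0, 0.0)]
C = cohesion_parameter([together, split], eps=0.6)  # (1.0 + 0.75) / 2
```

In a full analysis the snapshots would be pooled over time and over realizations, as in the definition of C.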

Simulations and sensitivity analysis
We simulate mobile groups of size N ranging from 5 to 50 and investigate the effect of the size of the neighbourhood an organism interacts with (K ∈ {1, …, N − 1}). We first investigate the stochastic pairwise interactions, where an organism interacts with only one randomly chosen neighbour, i.e. k = 1, and then compare the findings with the canonical averaging interactions, by setting k = K.
The parameters used for the study are motivated by the empirical results from Jhawar et al. [18]. We analyse how sensitive our findings are to these parameters and metrics chosen for the study. We find that the qualitative features of group cohesion and its dependence on K are insensitive to: (i) a broad range of the parameters used in the spatially explicit agent-based model, (ii) the parameters in calculating C and alternate definitions for the cohesion parameter, and (iii) the several variants of the spatial model. More information on this can be found in the electronic supplementary material, S2 and S3.

Results and discussion
Random choice of interacting neighbour, even with one individual, promotes group cohesion
When the number of individuals is small, which corresponds to group sizes in many simple experiments of collective motion [16,27,32], individuals form groups and remain reasonably cohesive even when the interaction is of the nearest-neighbour-type, i.e. organisms align and attract with only the nearest neighbour (K = 1, k = 1; see figure 2a for N = 3, 5). However, as the size of the group increases, interacting with just the nearest neighbour is no longer sufficient. Groups begin to break into smaller clusters of size 2 or 3 where the interactions between the organisms are confined to within the cluster. These clusters eventually drift away (see inset corresponding to K = 1 of figure 2a).
With increasing size of the topological neighbourhood while still interacting with only one random neighbour, i.e. larger K but still with k = 1, cohesivity of the groups (C) increases, taking values close to 1. This indicates that organisms reside in one tightly knit cluster stably throughout all time (see figure 2b; also see the electronic supplementary material, video 1 for visualization). This is likely because the number of unique individuals the organisms interact with increases with K. By contrast, when organisms interact only with their nearest neighbour, it is likely that they are interacting with the same neighbour repeatedly (for more information, see the electronic supplementary material, S4).
We find that the number of neighbours K required to achieve the same level of cohesion scales with group size N. Simply put, the proportion of the topological neighbours required to achieve a given level of group cohesion is independent of the group size N. We find the cohesion parameter to saturate when the ratio of the topological neighbourhood to the total group size reaches a threshold of approximately 0.3 (hashed region in figure 2b). We find that this threshold ratio reduces when organisms' speed ($s_0$) reduces, or with increasing rates of attraction ($r_c$) and alignment ($r_p$) (see the electronic supplementary material, S2).

Attraction interaction network reveals why cohesion emerges
To understand how cohesion emerges even from simple stochastic pairwise interactions, we turn to a graph-theoretic analysis of the underlying attraction interaction network between organisms in a group. We emphasize that we focus on attraction interactions rather than alignment interactions since our study is centred around how organisms maintain cohesion. While it is true that local directional alignment alone can also cause some degree of attraction, a major determining cause of group cohesion is the local attraction (see the electronic supplementary material, S5).
In our analysis, individual organisms can be considered as nodes, and a directed edge can be constructed from organism i to organism j whenever i exhibits an attraction interaction towards j. Since organisms interact asynchronously in our model, these edges are formed at distinct instants of time. Hence, to construct a graph that faithfully represents the underlying interaction network, we observe the different connections that arise between individuals over a time window $t_w$. To choose an appropriate time scale $t_w$ for the analysis, we use the length and velocity scales in the system corresponding to the motion an organism requires to break free from its associated cluster: ε (maximum distance between organisms belonging to the same cluster) and $s_0$ (desired speed of an individual). The time scale is then defined as $t_w = \varepsilon / s_0$. Notice that if $t_w \gg 1$, then we would (at least in some cases) expect a network that is dense or fully connected since each organism would have interacted many times, and if $t_w \ll 1$, the network would be sparse. Neither of these extremes would represent the 'correct' interaction network responsible for cohesion in a mobile group. Figure 3a illustrates how the attraction interaction network is constructed over a time period $t_w$. The network that emerges owing to the interactions is directed in nature; i.e. i → j does not imply j → i since each individual randomly and asynchronously chooses a neighbour to interact with. We argue that this underlying directed network encodes information pertaining to the group's cohesiveness. Although there are studies that investigate network properties in collective motion models [38,43,44], as far as we know, there are no off-the-shelf measures to characterize the interaction network to probe into why a group stays cohesive or breaks apart. In this section, we explore the correspondence between the properties of the network and the emergence of cohesion.
It is reasonable to expect a well-connected network to represent a cohesive group and a sparsely connected one to represent a non-cohesive group. A simple measure that characterizes how well a network is connected can be computed from an adjacency matrix A, where each element (i, j) of A takes the value 1 when there is an edge connecting i to j. From A, the average number of connections for a node can be computed, which is simply the average number of neighbours an organism interacts with within time $t_w$. Note here that the directed nature of the graph results in an A that is asymmetric.
However, for cohesion to emerge, organisms need not necessarily interact with every other organism in the group. An organism interacting with just a few of its immediate neighbours could result in a chain of events that lead to cohesion. To include this feature that arises not just from primary (or immediate) connections but also from connections that are secondary, tertiary, etc., we compute the reachability matrix Ã, where an element takes the value 1 when there is a path from i to j in the directed graph. To connect this property of the network to the group cohesivity, we divide a group into sub-groups based on Ã. A sub-group, in this context, is defined as a set of all organisms that have a path from and to every other organism in that sub-group. Then, we compute a network parameter $N_p$ that is the size of the largest sub-group (normalized by the size of the group), averaged over time and several realizations (see figure 3b for illustration; see appendix C for details on numerical computation and the electronic supplementary material, video 2 for a visualization of the network structure and corresponding group cohesion).
We find that $N_p$ increases with the neighbourhood size K, in a manner qualitatively similar to the cohesion parameter C (see figure 3c; also see the electronic supplementary material, S6, where the network parameter is shown to describe the qualitative trends in C versus K consistently for different levels of group cohesion). When K increases, interactions between organisms result in a network that is well connected, i.e. there is a path from every organism to almost every other in the group, even when K ≃ 0.3 × N. This informs us that when organisms select individuals to interact with at random from a considerable topological neighbourhood, an opportunity is created for the group to stay cohesive.
However, an interaction network created need not always materialize into a cohesive group. An organism can, in principle, interact with another organism in a cluster far away (in space) to create a well-connected network since the interactions in our model are topological. However, other interactions like spontaneous turning, alignment or collisions can break the network before it can cause the two clusters to come together. For this reason, we find $N_p$ to reach a high value (≃1) faster than C for most cases (points over the diagonal in the inset of figure 3c). However, when group sizes are small (N ≤ 5) or when organisms break into small clusters (for the case of K = 1), we find $N_p$ to be lower than C (points under the diagonal in the inset of figure 3c). These points refer to cases where the organisms are cohesive, but interactions are sparse, giving rise to a not fully connected network. Here, a considerable number of organisms reside in the periphery of the clusters; they do not have visible neighbours to interact with and hence get isolated from the rest of their neighbours in the calculation of the network parameter. Hence, these clusters have a lower value for $N_p$ even though they are spatially in proximity to their local neighbours (electronic supplementary material, S6).
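The construction of the reachability matrix and of the largest sub-group can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the procedure of appendix C: attraction events are assumed to be logged as hypothetical `(time, i, j)` tuples, reachability is obtained with Warshall's transitive-closure algorithm, and the value is computed for a single time window rather than averaged.

```python
def network_parameter(events, n, t_start, t_w):
    """Normalized size of the largest mutually reachable sub-group in one window.

    events: iterable of (time, i, j), meaning agent i was attracted towards j.
    n: number of agents; the window is [t_start, t_start + t_w).
    """
    # Adjacency matrix of the directed interaction network within the window.
    reach = [[False] * n for _ in range(n)]
    for t, i, j in events:
        if t_start <= t < t_start + t_w:
            reach[i][j] = True
    for i in range(n):
        reach[i][i] = True  # every agent trivially reaches itself
    # Warshall's algorithm: reach[i][j] becomes True if any directed path i -> j exists.
    for m in range(n):
        for i in range(n):
            if reach[i][m]:
                for j in range(n):
                    if reach[m][j]:
                        reach[i][j] = True
    # The sub-group of agent i is the set of agents j with paths i -> j and j -> i.
    largest = max(
        sum(1 for j in range(n) if reach[i][j] and reach[j][i])
        for i in range(n)
    )
    return largest / n

# Agents 0, 1 and 2 form a directed cycle; agent 3 only points into it.
events = [(0.1, 0, 1), (0.2, 1, 2), (0.3, 2, 0), (0.4, 3, 0)]
full_window = network_parameter(events, n=4, t_start=0.0, t_w=1.0)    # 3 of 4 agents
short_window = network_parameter(events, n=4, t_start=0.0, t_w=0.25)  # no mutual paths yet
```

The example also shows the role of the window length: the same event log gives a larger sub-group when the window is long enough to contain the edge that closes the cycle.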

Cohesion owing to averaging interactions
In canonical models for flocking, like the Vicsek model for alignment, an agent often averages the information from a neighbourhood to find its direction of movement. Here we compare the group cohesivity achieved through stochastic pairwise interactions with that of the averaging-type interactions. We recall that while stochastic-pairwise interactions are achieved by setting k = 1 and K > 1, we obtain the topological averaging interaction (like the Vicsek model for alignment) by setting k = K.
We find averaging interactions also achieve cohesion, with cohesion parameter C increasing rapidly with the size of the neighbourhood, K (figure 4). We emphasize that while focal agents interact with all neighbours K in the averaging-type interactions (because k = K), stochastic pairwise interactions permit interaction with only one neighbour (k = 1). Averaging interaction, by definition, consumes information from all the interacting K neighbours, while a pairwise interaction takes in information from only one of its K neighbours at a time. Thus, it is not surprising that cohesion is achieved more rapidly in the local-averaging type interaction. This is in line with the results in the literature [6,26].
Interestingly, beyond a certain value of neighbourhood size K, both the averaging and the pairwise interactions produce similar (maximum) cohesion. Hence, organisms interacting via these two different interaction types will not have any additional advantage with regards to cohesivity. However, if we compare the neighbourhood sizes required to achieve a given value of cohesion (say, 0.75), then we observe that the averaging interaction can achieve that level of cohesion with fewer neighbours ($K_a$) than a pairwise interaction ($K_p$) (see horizontal line at C = 0.75 in figure 4; also see the electronic supplementary material, video 3 for visualization). From the viewpoint of the organism's cognitive capacity, the choice is between: (i) assimilating information from all neighbours in a small neighbourhood of size $K_a$ and averaging them, or (ii) assimilating information of one neighbour from a larger set of $K_p$ neighbours. While it is known that the cognitive load required to track a large number of neighbours is high [45][46][47], we do not yet know which of these two processes has a smaller cognitive load.
However, one could safely assume that an organism capable of integrating information from multiple sources and limited by its ability to observe only a small part of its neighbourhood would prefer method (i) over method (ii), while a different organism that finds integrating information together difficult would choose (ii) to achieve the same level of cohesion.
An important question may arise at this point: are these two kinds of interactions, averaging and pairwise, truly distinct? More specifically: is it possible to produce an 'averaging-interaction' by merely applying pairwise interactions multiple times over different neighbours? This is an interesting question with important consequences from the point of inferring the behavioural rules in organisms. In earlier work from our group [18,21], in the context of alignment, we show that pairwise copying (agent interacting with just one other) produces different jump moments in comparison to any higher-order interaction (either three-agent interaction or averaging), which gives rise to distinct mean-field models (stochastic differential equations (SDE)) for the polarization order parameter (m). The fluctuations in the order parameter produced by pairwise copying can be explained with an SDE in which a and c represent the rate of random turns and the rate at which agents copy a randomly chosen neighbour, respectively, and η(t) is Gaussian white noise. Here, the order emerges owing to the multiplicative noise term (that which multiplies the noise η(t)). In contrast, a higher-order alignment interaction, occurring at rate h, which is equivalent to averaging, can be modelled with the following SDE:
$$\frac{dm}{dt} = -a m + h\left(1 - |m|^2\right) m + \sqrt{N^{-1}} \sqrt{(c + h)\left(1 - |m|\right)^2 + a}\; \eta(t).$$
Here, order emerges owing to the deterministic part of the SDE. In these studies, we clearly demonstrate that these two types of interactions can be differentiated in empirical investigations by extracting the governing SDE from data. These results reveal the inherent qualitative differences between a pairwise and an averaging interaction: a pairwise interaction is symmetric and does not guarantee a net increase in the order parameter and so order emerges only because of noise, while a higher-order interaction like that of the averaging can guarantee net increase in order and can deterministically hold the system at an ordered state. We speculate that the same principles can be used to argue that the pairwise and averaged attraction interactions are distinct. We welcome further work in this area to identify measurable metrics that will differentiate a cohesive mobile group exhibiting averaging attraction interactions from one that interacts via pairwise attraction interactions.

Concluding remarks
In this study, using a spatially explicit agent-based model, we show that group-level cohesion can emerge when organisms move towards just one other randomly chosen nearby organism. We show that a random choice of the neighbour, rather than a fixed neighbour such as the nearest individual, considerably improves the group cohesion. Cohesion emerges even with such simple stochastic pairwise interactions because choosing a neighbour randomly creates a well-connected long-ranged interaction network. We show that the connectedness of the interaction network correlates well with the cohesivity of the mobile group. Constructing the interaction network was possible because we had complete access to all information pertaining to the interactions, their time-stamps and organisms-indices, owing to the theoretical nature of the work. In an experimental setting, it would be challenging to estimate the underlying network structure from data of organismal motion. In a recent study, a ray casting approach was used to identify a network based on the vision of individual fish [48]. This network had an edge connecting an organism to every other organism in its perceivable neighbourhood; not specific to attraction interactions (or any other). We believe that re-constructing the hidden interaction networks from movement data would be an exciting future direction for research.
However, do organisms really choose a random neighbour to interact with? A random neighbour could be chosen in many ways. For instance, an organism could prefer a faster-moving individual to interact with, over other slower-moving ones. Also, because organisms may move at different speeds during the course of their motion, which change continuously owing to spontaneous activity and collisions, this 'faster-individual' may be found anywhere within a neighbourhood of a certain size. Lei and co-workers argue that fish choose to interact with a few of their most-influential neighbours [17]. However, since the 'influence' a neighbour has on a fish is a function of its proximity, relative positions and orientations, which change continuously as fish move in a school [16], the most-influential fish could essentially take any position within the school at a given time: from the nearest neighbour to the farthest one. We speculate that choosing the most influential neighbour could be similar to choosing a fish randomly from a neighbourhood of size K.
In summary, our study shows that when an organism randomly chooses another to interact with, irrespective of the specific mechanism, the resulting interaction network is well connected, giving rise to considerable group cohesion in small to intermediate group sizes. However, we expect large systems of collective motion to exhibit dynamic fission and fusion, since the topological neighbourhood an organism would need to perceive to maintain group cohesion will be much larger than what is biologically feasible. We welcome further research on empirically motivated and parametrized models of collective motion that account for the stochastic decision-making of organisms, with an emphasis on group cohesion and fission-fusion group dynamics, and that explore the functional significance of heterogeneity between individuals in the group.
We employ a two-dimensional model to simulate the collective motion of a mobile group of organisms. The motion of an organism depends on its instantaneous velocity (given by its direction of heading and its speed), which evolves in time as the organism undergoes various interactions or encounters an obstacle. Equations (A 1) and (A 2) describe how an organism's heading and speed evolve in time. Every time an organism interacts, it gets a new desired direction of motion $e_{d,i}$ and a new desired speed $s_{d,i}$, and it tries to achieve this new desired velocity over a time scale $\tau$. The value of $\tau$ is related to the organism's body mass and its ability to turn or move quickly. Each organism aligns, attracts or turns spontaneously with rates $r_p$, $r_c$ and $r_s$, respectively. The time of each interaction is chosen stochastically: it is sampled from an exponential distribution whose mean is set by the rate of the interaction. The terms $\Delta\theta_{r,i}$ and $\Delta s_{r,i}$ in equations (A 1) and (A 2) correspond to the changes in the direction and speed of an organism owing to the presence of an obstacle, which could be another organism or a boundary. Note that an organism moves continuously in space and time, making continuous adjustments to its trajectory when approaching an obstacle, whereas changes to its desired velocity owing to interaction events occur only at discrete points in time.
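The two ingredients described above, stochastic event times and relaxation towards a desired state, can be sketched as follows. This is an illustration under stated assumptions and not the study's MATLAB code: we assume that the mean waiting time between events is the inverse of the rate, and that the speed relaxes as a first-order process with time scale τ, which may differ from the exact form of equation (A 2). The numerical values are arbitrary.

```python
import random


def sample_event_times(rate, t_end, rng):
    """Event times of a Poisson process on [0, t_end).

    Successive waiting times are exponential with mean 1/rate
    (our assumed reading of the text).
    """
    times = []
    t = rng.expovariate(rate)
    while t < t_end:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def relax(value, desired, tau, dt):
    """One explicit Euler step of d(value)/dt = (desired - value) / tau.

    The first-order form is an assumption, not equation (A 2) itself.
    """
    return value + dt * (desired - value) / tau


rng = random.Random(1)  # fixed seed so the sketch is reproducible
events = sample_event_times(rate=0.5, t_end=200.0, rng=rng)

# Between events the desired speed is fixed and the actual speed relaxes to it.
s, s_desired, tau, dt = 0.2, 1.0, 0.5, 0.05
for _ in range(200):
    s = relax(s, s_desired, tau, dt)
```

After many steps of size dt the speed is indistinguishable from the desired speed; a new interaction event would reset the desired value and restart the relaxation.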

A.2. Behavioural interactions of the organisms
In our simulations, we assume that organisms undergo three different types of interactions: spontaneous turning, alignment and attraction. In what follows we describe the mathematical details of these interactions.

A.2.1. Spontaneous turning
As an organism moves, it spontaneously turns at a mean rate $r_s$, independent of its own state and the states of its neighbours. The angle it wishes to turn to is sampled from a (circular) normal distribution with mean zero and variance $\sigma_a$, as in equation (A 3). The speed of the individual is sampled from a normal distribution with mean $s_0$, which is the desired speed of movement of the organism, and variance $\sigma_s$, which quantifies the spread of the speeds of the individuals as observed in typical empirical data [

A.2.3. Attraction
An organism $i$ moves towards another organism $k$, at an average rate $r_c$, to stay cohesive. It is reasonable to expect $i$ to change its current trajectory to move towards $k$ only when it is considerably far away from $k$. This feature is brought into the interaction term by using a distance-based weighting $f_{a,l}$, as shown in equation (A 9). We can also model the change in speed experienced by $i$ in a similar manner: when $i$ is far away from $k$, it moves faster (see equation (A 10)). The weighting function selected for use in our model has the form shown in equation (A 13). When $\gamma > 2$, it softens the effect of the attractive interaction within a distance $l$ of organism $k$. Only when $(\lVert r_{ik} \rVert - 2R) > l$ does the attractive interaction have a significant effect.
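The qualitative behaviour of such a distance-based weighting can be illustrated with a short sketch. The functional form used here, a saturating exponential of the surface-to-surface gap, is our own assumption chosen only because it is negligible well inside the length scale and close to one well beyond it; it should not be read as equation (A 13). The function name and the parameter values are likewise hypothetical.

```python
import math


def attraction_weight(dist, radius, length, gamma):
    """Illustrative weight in [0, 1) for the attraction interaction.

    `dist` is the centre-to-centre distance, `radius` the body radius R,
    `length` the softening length scale and `gamma` the exponent.
    The form 1 - exp(-(gap / length) ** gamma) is an assumption.
    """
    gap = max(dist - 2.0 * radius, 0.0)  # surface-to-surface separation
    return 1.0 - math.exp(-((gap / length) ** gamma))


# Neighbour well inside the length scale: attraction is almost switched off.
near = attraction_weight(dist=1.5, radius=0.5, length=2.0, gamma=4.0)
# Neighbour far beyond the length scale: attraction acts at full strength.
far = attraction_weight(dist=7.0, radius=0.5, length=2.0, gamma=4.0)
```

With this choice, a larger exponent makes the transition around the length scale sharper, which is one way to picture how the exponent softens the interaction at short range.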

Collision avoidance
Organisms avoid collisions with each other during collective motion. To prevent collision events, they turn towards a direction that is perpendicular to the line connecting the organisms, as shown in equation (A 14). Here, $n_r$ is the number of individuals that are within the zone of repulsion (zor), which is the critical distance between organisms below which they begin to respond to the proximity of their neighbours. In addition to turning, they also slow down as they approach an obstacle, as shown in equation (A 15). The rate of slowing down is inversely proportional to the distance between the organisms and directly proportional to the rate of approach. The speed $s_i$ of an organism is bounded between 0 and $s_{max}$, which is set to 1.0 cm s$^{-1}$ in our study. This bound is enforced in both the spontaneous turning and attraction interactions.
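Two elements of this rule, the perpendicular turn and the speed bound, can be sketched as follows. The sketch handles a single neighbour only and does not reproduce equations (A 14) and (A 15). The choice between the two perpendicular directions (we take the one closer to the current heading) is an assumption, and the function names are hypothetical; the value of the maximum speed is the one stated in the text.

```python
import math

S_MAX = 1.0  # maximum speed in cm/s, as stated in the text


def clamp_speed(s, s_max=S_MAX):
    """Keep a sampled or updated speed within [0, s_max]."""
    return min(max(s, 0.0), s_max)


def perpendicular_heading(heading, r_ik):
    """Unit vector perpendicular to the line joining organisms i and k.

    Of the two perpendicular directions, the one closer to the current
    heading is returned; this tie-breaking rule is an assumption.
    """
    x, y = r_ik
    norm = math.hypot(x, y)
    p = (-y / norm, x / norm)  # r_ik rotated by 90 degrees, normalized
    dot = p[0] * heading[0] + p[1] * heading[1]
    return p if dot >= 0 else (-p[0], -p[1])


# An organism heading along +x with a neighbour up and to the right.
new_heading = perpendicular_heading(heading=(1.0, 0.0), r_ik=(1.0, 1.0))
bounded = [clamp_speed(s) for s in (-0.3, 0.4, 1.7)]
```

In the example the organism veers down and to the right, away from the neighbour, and the three trial speeds are mapped onto the allowed range.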

A.2.6. Remark: local state-dependent behaviour
We model organismal movement using agents that interact with each other at a rate intrinsic to the agent, i.e. independent of the local state of the group. Simply put, a fish aligns with a neighbouring fish irrespective of whether the local neighbourhood is ordered or not. Similarly, a fish does not move towards its neighbour in response to perceiving a less cohesive neighbourhood. Interestingly, however, the implementation of the alignment and attraction rules introduces an indirect local-state-dependent behaviour. An organism aligning with a neighbour does not change the local order significantly when the local neighbourhood is already sufficiently ordered, but the change is significant when the neighbourhood is disordered. The weighting function $f_{a,l}$, as described in equation (A 13), ensures that the effect of the attraction interaction is significant only when the organism is far from the neighbour it is interacting with. Even though organisms interact in a manner that is independent of the local state of the system, the interaction events themselves cause a significant change in the local group properties only when the local state is far from the one the fish desires to achieve.

A.2.7. Numerical simulations
The codes for the agent-based model were developed in-house and the numerical simulations were carried out in MATLAB. We ran simulations for T = 3500 s with an integration time step of dt = 0.05 s for 24 realizations. Codes have been made available (see code availability). The ranges of parameter values explored in the study are given in table 1.

Appendix B. Details of the minimal model
To check the generality of our findings, we modify the detailed model by retaining only the parameters necessary to capture self-propelled agents. We remove components such as agent size, collision avoidance and variable speed. While we preserve the types of interactions described in the main model, we modify the attraction interaction (A 11) to its simplest form, as agents are now only point objects. Further, we also modify the probabilistic nature of the interactions: in this case, all agents interact via all three types of interactions (alignment, attraction and spontaneous change) simultaneously. Nevertheless, the time at which the agents interact is still chosen stochastically with a constant rate. We find the qualitative features of the results (pertaining to group cohesion) to be similar to those of the detailed model (see the electronic supplementary material, S3 for details).
The ranges of parameter values explored in the study are given in table 2.
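The neighbour choice at the heart of the study, applied to point agents as in the minimal model, can be sketched as follows. This is an illustration rather than the study's implementation: we assume that the neighbourhood consists of the K nearest agents, and that the simplest form of attraction is a unit vector pointing at the chosen neighbour. All function names are hypothetical.

```python
import math
import random


def k_nearest(i, positions, k):
    """Indices of the k agents closest to agent i, nearest first."""
    others = [j for j in range(len(positions)) if j != i]
    others.sort(key=lambda j: math.dist(positions[i], positions[j]))
    return others[:k]


def choose_neighbour(i, positions, k, rng, random_choice=True):
    """Pick one interaction partner for agent i.

    random_choice=True: uniformly at random among the k nearest agents.
    random_choice=False: always the single nearest agent.
    """
    candidates = k_nearest(i, positions, k)
    return rng.choice(candidates) if random_choice else candidates[0]


def attraction_direction(i, j, positions):
    """Unit vector from point agent i to point agent j (assumed simplest form)."""
    dx = positions[j][0] - positions[i][0]
    dy = positions[j][1] - positions[i][1]
    norm = math.hypot(dx, dy)
    return (dx / norm, dy / norm)


positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
rng = random.Random(0)
nearest = choose_neighbour(0, positions, k=2, rng=rng, random_choice=False)
picked = choose_neighbour(0, positions, k=2, rng=rng)
direction = attraction_direction(0, 2, positions)
```

Switching the flag between the two modes while keeping everything else fixed is one way to compare random and nearest-neighbour attraction on equal terms.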